@sirmax sirmax commented Aug 8, 2025

When calculating a random insertion index for an element, Gen.pick uses the wrong order of operations:

val i = (x & Long.MaxValue % count).toInt

As written, it first computes the modulo Long.MaxValue % count (because % binds tighter than &) and then uses that result as a mask on the random x. Instead, it should mask x first to get a non-negative value, and only then take the modulo.

This PR fixes the order of operations.
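
To make the difference concrete, here is a minimal standalone sketch that compares the two expressions. The object name and the helpers buggyIndex and fixedIndex are hypothetical and exist only for this illustration; they are not part of Gen.scala. It assumes count = 10 as an example value.

```scala
object PickIndexPrecedence {
  // Expression as originally written. `%` binds tighter than `&`,
  // so this evaluates as x & (Long.MaxValue % count).
  def buggyIndex(x: Long, count: Int): Int =
    (x & Long.MaxValue % count).toInt

  // Expression with explicit parentheses: clear the sign bit first,
  // then reduce modulo count.
  def fixedIndex(x: Long, count: Int): Int =
    ((x & Long.MaxValue) % count).toInt

  def main(args: Array[String]): Unit = {
    val count = 10
    val xs = 0L until 1000L
    // Long.MaxValue % 10 == 7, so the buggy form masks x with 0b111.
    println(xs.map(buggyIndex(_, count)).distinct.sorted)
    println(xs.map(fixedIndex(_, count)).distinct.sorted)
  }
}
```

With count = 10 the buggy form can only yield indices 0 to 7, so positions 8 and 9 are never selected, while the parenthesized form covers the full range 0 to 9.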

@satorg satorg requested a review from Copilot August 10, 2025 01:20
@Copilot Copilot AI left a comment

Pull Request Overview

This PR fixes a bug in the Gen.pick method where incorrect operator precedence caused the random insertion index calculation to produce a biased distribution, resulting in only a subset of possible combinations being generated.

  • Fixes operator precedence in random index calculation by adding parentheses around (x & Long.MaxValue)
  • Adds a test to verify that Gen.pick produces the expected number of distinct combinations

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
core/shared/src/main/scala/org/scalacheck/Gen.scala Fixes operator precedence bug in random index calculation
core/jvm/src/test/scala/org/scalacheck/GenSpecification.scala Adds test to verify correct number of distinct combinations are generated

@satorg satorg left a comment

Good catch, thank you!

Looks like Gen.pick hasn't worked properly all this time.


val genPick = pick(n, 1 to nElements)
// not interested in different permutations, only in distinct combinations
.map(_.toList.sorted)
Contributor

Out of curiosity, what is the reason for converting the output of pick to a List before sorting it? It doesn't seem to improve performance or affect the test in any other way.

Contributor Author

It looks like a leftover from my attempts at writing this test. Removed it.

I think at some point I had sets and also tried to collect the stats on those. That went through .toString on sets, which, amusingly, may differ for equal small sets (up to 4 elements). So it ended up as aSet.toList.sorted and later got scrapped, but toList somehow survived 😅
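
A small sketch of the small-set behavior mentioned above, assuming the standard immutable Set from the Scala standard library. The object name is hypothetical and the snippet is not taken from the PR.

```scala
object SmallSetToString {
  // Two sets with the same elements, built in a different order.
  val a: Set[Int] = Set(1, 2, 3)
  val b: Set[Int] = Set(3, 2, 1)

  def main(args: Array[String]): Unit = {
    // Equality ignores element order.
    println(a == b)
    // The string forms may differ, since small immutable sets
    // (up to 4 elements) iterate in insertion order.
    println(s"$a vs $b")
    // Sorting a list of the elements gives an order-independent key.
    println(a.toList.sorted == b.toList.sorted)
  }
}
```

This is why grouping samples by toString can count one combination several times, while a sorted sequence gives a stable key.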

// that every possible combination appears at least once. It depends on
// nSamples, but there is no specific math behind it. Rather, it's just an
// empirically chosen value, that has yielded no failures over thousands of
// test cycles.
Contributor Author

…over thousands of test cycles.

Specifically this

+ Gen.pick produces enough distinct combinations: OK, passed 10000000 tests

satorg commented Aug 12, 2025

I believe it is good to go, thanks!

@satorg satorg merged commit 38559e0 into typelevel:main Aug 12, 2025
18 checks passed